Supplementary material for "Improving neural network representations using human similarity judgments"

Neural Information Processing Systems

[Supplementary excerpt] The appendix details the anomaly detection (AD) setup: given a dataset (e.g., CIFAR-10), model representations are evaluated in both the "one-vs-rest" setting and a leave-one-out (LOO) setting, where one class is held out. Experiments used approximately 5600 CPU-hours on 2.90 GHz Intel Xeon Gold processors. Table B.1 lists pairs of individual items from THINGS, ranked by the relative change in cosine distance under naive alignment; the top pairs (e.g., semantically unrelated but visually similar items such as "stethoscope") move much closer together, while the bottom pairs move much farther apart. Figure B.1 examines how the global structure of the representations changes after alignment.



Improving neural network representations using human similarity judgments

Neural Information Processing Systems

Deep neural networks have reached human-level performance on many computer vision tasks. However, the objectives used to train these networks enforce only that similar images are embedded at similar locations in the representation space, and do not directly constrain the global structure of the resulting space. Here, we explore the impact of supervising this global structure by linearly aligning it with human similarity judgments. We find that a naive approach leads to large changes in local representational structure that harm downstream performance. Thus, we propose a novel method that aligns the global structure of representations while preserving their local structure. This global-local transform considerably improves accuracy across a variety of few-shot learning and anomaly detection tasks. Our results indicate that human visual representations are globally organized in a way that facilitates learning from few examples, and incorporating this global structure into neural network representations improves performance on downstream tasks.
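As a rough illustration of the "naive" linear alignment the abstract describes, the sketch below fits a least-squares linear map from model embeddings onto coordinates derived from a human similarity matrix (via a classical-MDS-style eigendecomposition). The toy data and all variable names are illustrative assumptions, not the paper's actual method, models, or datasets.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 16

# Toy stand-ins: model embeddings X and a human similarity matrix S.
X = rng.normal(size=(n, d))
sq_dists = np.sum((X[:, None] - X[None]) ** 2, axis=-1)
S = np.exp(-0.5 * sq_dists / d)  # placeholder symmetric similarity matrix

# Derive k-dimensional target coordinates from S (classical MDS flavour):
# scale the top eigenvectors by the square roots of their eigenvalues.
vals, vecs = np.linalg.eigh(S)          # eigenvalues in ascending order
k = 8
Y = vecs[:, -k:] * np.sqrt(np.clip(vals[-k:], 0.0, None))

# "Naive" alignment: a single least-squares linear map from X onto Y.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)
aligned = X @ W
print(aligned.shape)
```

Because this transform is fit only to the global similarity targets, nothing constrains nearby points to stay nearby, which is consistent with the abstract's observation that naive alignment can distort local structure.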


Similarity and Matching of Neural Network Representations

Neural Information Processing Systems

We employ a toolset --- dubbed Dr. Frankenstein --- to analyse the similarity of representations in deep neural networks. With this toolset we aim to match the activations on given layers of two trained neural networks by joining them with a stitching layer. We demonstrate that the inner representations emerging in deep convolutional neural networks with the same architecture but different initialisations can be matched with a surprisingly high degree of accuracy even with a single, affine stitching layer. We choose the stitching layer from several possible classes of linear transformations and investigate their performance and properties. The task of matching representations is closely related to notions of similarity. Using this toolset we also provide a novel viewpoint on the current line of research regarding similarity indices of neural network representations: the perspective of the performance on a task.
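A single affine stitching layer can be pictured as a learned linear map (plus bias) from one network's activations at a layer to the other's. A minimal numpy sketch, using synthetic activations in place of real network features (the variables and the closed-form least-squares fit are assumptions for illustration, not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(1)
batch, c = 256, 32

# Toy activations from two independently trained networks at the same layer;
# here B is constructed to be nearly an affine function of A.
A = rng.normal(size=(batch, c))
mix = rng.normal(size=(c, c))
B = A @ mix + 0.01 * rng.normal(size=(batch, c))

# Affine stitching layer: append a bias column and solve least squares
# for the map M that carries net-1 activations into net-2's space.
A1 = np.hstack([A, np.ones((batch, 1))])
M, *_ = np.linalg.lstsq(A1, B, rcond=None)
stitched = A1 @ M

rel_err = np.linalg.norm(stitched - B) / np.linalg.norm(B)
print(rel_err < 0.05)
```

A small relative error here mirrors the paper's finding that same-architecture networks trained from different initialisations can be matched surprisingly well by a single affine layer; in practice the stitching layer is trained end-to-end on the task loss rather than fit in closed form.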


Evaluating alignment between humans and neural network representations in image-based learning tasks

Neural Information Processing Systems

We found that while training dataset size was a core determinant of alignment with human choices, contrastive training with multi-modal data (text and imagery) was a common feature of currently publicly available models that predicted human generalisation. Intrinsic dimensionality of representations had different effects on alignment for different model types. Lastly, we tested three sets of human-aligned representations and found no consistent improvements in predictive accuracy compared to the baselines.



Differentiable neural network representation of multi-well, locally-convex potentials

Jones, Reese E., Tepole, Adrian Buganza, Fuhg, Jan N.

arXiv.org Machine Learning

Multi-well potentials are ubiquitous in science, modeling phenomena such as phase transitions, dynamic instabilities, and multimodal behavior across physics, chemistry, and biology. In contrast to non-smooth minimum-of-mixture representations, we propose a differentiable and convex formulation based on a log-sum-exponential (LSE) mixture of input convex neural network (ICNN) modes. This log-sum-exponential input convex neural network (LSE-ICNN) provides a smooth surrogate that retains convexity within basins and allows for gradient-based learning and inference. A key feature of the LSE-ICNN is its ability to automatically discover both the number of modes and the scale of transitions through sparse regression, enabling adaptive and parsimonious modeling. We demonstrate the versatility of the LSE-ICNN across diverse domains, including mechanochemical phase transformations, microstructural elastic instabilities, conservative biological gene circuits, and variational inference for multimodal probability distributions. These examples highlight the effectiveness of the LSE-ICNN in capturing complex multimodal landscapes while preserving differentiability, making it broadly applicable in data-driven modeling, optimization, and physical simulation.
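The log-sum-exponential mixture can be illustrated with ordinary quadratic wells standing in for the ICNN modes; this is a sketch of the LSE soft-min idea only, not the LSE-ICNN architecture or its sparse-regression mode discovery.

```python
import numpy as np

def lse_soft_min(fs, eps):
    """Smooth surrogate for min_i f_i(x): -eps * log(sum_i exp(-f_i / eps))."""
    fs = np.asarray(fs)
    m = fs.min(axis=0)  # subtract the minimum for numerical stability
    return m - eps * np.log(np.exp(-(fs - m) / eps).sum(axis=0))

x = np.linspace(-3.0, 3.0, 601)
wells = [(x + 1.5) ** 2, (x - 1.5) ** 2]  # two convex "modes"
V = lse_soft_min(wells, eps=0.1)

# The surrogate lower-bounds the non-smooth minimum and converges
# to it as eps -> 0, while remaining differentiable everywhere.
hard = np.minimum(wells[0], wells[1])
print(bool(np.all(V <= hard + 1e-9)))  # True
```

Unlike `np.minimum`, the LSE surrogate has well-defined gradients at the basin boundary (here x = 0), which is what makes gradient-based learning and inference over the multi-well landscape tractable.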

